Main Result #1: Organization Balance decreases over time
main %>% ggplot(aes(x = date, y = as.numeric(balance))) +
geom_smooth(method = "gam") +
theme_min +
labs(y = "Balance", x = "Date")
ggsave(here("paper/figures/balance-overview-gam.png"), width = 6, height = 4)
Robust to SD as alternate outcome
main %>% ggplot(aes(x = date, y = cfscore.sd)) +
geom_smooth(method = "loess") +
geom_point(alpha=0.1) +
theme_min
Robust to varying imputation strategy
Overall, the negative time trend is robust to all of our imputation strategies. Our main specification uses matched cfscores and includes imputed cfscores when data is missing Other specifications including only using the organization classifications, the raw DIME data (no imputation), and exclusively using imputation tell a similar story
specs <- list(
list("types" = c("org_category"), cfscore.options = list(), label = "Category only"),
list("types" = c("org_category", "cfscore"), cfscore.options = list(impute.off = T, nwd.sd.thres = 0.8), label = "Category + raw cfscore"),
list("types" = c("org_category", "cfscore"), cfscore.options = list(nwd.sd.thresh = 0.8), label = "Category + imputed cfscore"),
list("types" = c("org_category", "cfscore"), cfscore.options = list(impute.all = T, nwd.sd.thresh = 0.8), label = "Category + total imputation")
)
df <- lapply(specs, \(t) {
sources %>%
calculate_balance(
types = t$types,
cfscore.options = t$cfscore.options,
) %>%
mutate(label = t$label)
}) %>% bind_rows()
df %>% ggplot(aes(x = date, y = as.numeric(balance), color = label)) +
geom_smooth(se = F) +
theme_min +
labs(x = "Date", y = "Balance", color = "Specification")
ggsave(here("paper/figures/balance-overview-robust-1.png"), width = 6, height = 4)
Robustness increases when dropping bad matches
We define a bad match as one where there the DIME ideology scores for an organization are widely varying, ie when the SD of the scores exceeds a certain threshold. Our main specification uses a threshold of 0.8 We can see how the relationship changes with different thresholds for determining “widely varying” The following chart shows that the negative time trend is increasingly apparent the more strictly we reject bad matches.
df <- lapply(seq(0.1, 1, 0.1), \(x) {
sources %>%
calculate_balance(
types = c("org_category", "cfscore"),
cfscore.options = list(nwd.sd.thresh = x),
) %>%
mutate(label = x)
}) %>% bind_rows()
df %>% ggplot(aes(x = date, y = as.numeric(balance), color = as.factor(label))) +
geom_smooth(se = F) +
theme_min +
labs(x = "Date", y = "Balance", color = "Specification")
ggsave(here("paper/figures/balance-overview-robust-2.png"), width = 6, height = 4)
Robustness to varying left/right threshold
Our measure of balance depends on an arbitrary threshold ‘t’ Organizations with cfscore > t and cfscore < -t are classified as “right” and “left” Our main specification uses a t=0.2 but results are robust to other specifications of t.
seq(0.1, 0.6, 0.1) %>%
lapply(\(t) {
sources %>%
group_by(textfile, date) %>%
calculate_balance(
types = c("org_category", "cfscore"),
cfscore.options = list(nwd.sd.thresh = 0.8),
thresh = t
) %>%
mutate(thresh = t)
}) %>%
bind_rows() %>%
ggplot(aes(x = date, y = as.numeric(balance), color = as.factor(thresh))) +
geom_smooth(se = F) +
theme_min +
labs(x = "Date", y = "Balance", color = "Threshold")
ggsave(here("paper/figures/balance-overview-robust-3.png"), width = 6, height = 4)
NOT ROBUST to including politicians or topics
df <- list(
c("org_category", "cfscore"),
c("org_category", "cfscore", "party"),
c("org_category", "cfscore", "topic"),
c("org_category", "cfscore", "party", "topic")
) %>%
lapply(\(t) {
sources %>%
calculate_balance(
types = t,
) %>%
mutate(types = paste0(t, collapse = ","))
}) %>%
bind_rows()
df %>%
ggplot(aes(x = date, y = as.numeric(balance), color = as.factor(types))) +
geom_smooth(se = F) +
theme_min +
labs(x = "Date", y = "Balance", color = "Threshold")
Politicians and SD
Interesting pattern with SD. Parabolic shape…
specs <- list(
list("types" = c("org_category", "cfscore"), cfscore.options = list(nwd.sd.thresh = 0.8), label = "Default"),
list("types" = c("org_category", "cfscore"), cfscore.options = list(nwd.sd.thres = 0.8, pols = T), label = "Include Politician CFscore")
)
df <- specs %>%
lapply(\(t) {
sources %>%
calculate_balance(
types = t$types,
cfscore.options = t$cfscore.options,
) %>%
mutate(label = t$label)
}) %>%
bind_rows()
df %>% ggplot(aes(x = date, y = as.numeric(cfscore.sd), color = label)) +
geom_smooth(method = "loess") +
theme_min +
labs(x = "Date", y = "SD", color = "model")